Learning and Detecting Concept Drift

Author

Kyosuke Nishida

Abstract

The volume of data that humans create has increased explosively as information science and technology have evolved. As a result, the demand for learning machines that can extract input-output mappings and knowledge rules from massive data sets has become more urgent, and machine learning is now a core technology in the advanced information society. It has been applied to fields such as pattern recognition, search engines, medical support, robot engineering, image processing, and data mining, and has achieved significant accomplishments in each field. Recently, several methods that could not be implemented on older computers have been developed on state-of-the-art computers that have enormous memory capacity and high-performance CPUs. Machine learning is expected to continue to develop in the future. Machine learning can be roughly classified into two types based on how training examples are presented: batch learning and online learning. Batch learning systems are first given a large number of examples, which they then learn all at once. In contrast, online learning systems are given examples sequentially and learn them one by one. Many excellent batch learning systems have been proposed, but there are serious problems with online learning, especially in environments where the statistical properties of the target variable change over time. This change, known as concept drift, can happen either gradually or suddenly and significantly. The effectiveness of strategies for building a good learning system depends on the types of changes, so it is difficult to create an ideal learning system. A prime example of concept drift is the spam filtering problem. An effective spam filter must be able to handle various
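The distinction between batch and online learning under sudden concept drift can be illustrated with a toy sketch. This is not the author's method: the alternating input stream, the perceptron-style update, the window size, and the error threshold are all assumptions chosen only to make the behavior easy to follow. One model is trained on the first examples and then frozen (batch-like), the other keeps updating on every mistake (online), and a sliding window over the frozen model's recent mistakes serves as a very simple drift signal.

```python
from collections import deque


def stream(n, drift_at):
    """Yield (x, y) pairs; the labelling rule is inverted at index drift_at."""
    for t in range(n):
        x = 1 if t % 2 == 0 else -1
        y = x if t < drift_at else -x  # sudden drift: the concept flips
        yield x, y


def predict(w, x):
    """One-weight linear classifier with labels in {-1, +1}."""
    return 1 if w * x >= 0 else -1


def run(n=200, drift_at=100, train_size=100, window=10, threshold=0.5):
    w_online = 0   # keeps learning for the whole stream
    w_frozen = 0   # learns only from the first train_size examples
    online_errors = 0
    frozen_errors = 0
    recent = deque(maxlen=window)  # recent mistakes of the frozen model
    detected_at = None

    for t, (x, y) in enumerate(stream(n, drift_at)):
        # Frozen (batch-like) model: updated only during its training phase.
        frozen_wrong = predict(w_frozen, x) != y
        frozen_errors += frozen_wrong
        if frozen_wrong and t < train_size:
            w_frozen += y * x

        # Online model: perceptron-style update on every mistake.
        online_wrong = predict(w_online, x) != y
        online_errors += online_wrong
        if online_wrong:
            w_online += y * x

        # Hypothetical drift signal: windowed error rate of the frozen model.
        recent.append(frozen_wrong)
        if (detected_at is None and len(recent) == window
                and sum(recent) / window >= threshold):
            detected_at = t

    return {
        "online_errors": online_errors,
        "frozen_errors": frozen_errors,
        "detected_at": detected_at,
    }


if __name__ == "__main__":
    print(run())
```

With the default settings, the online model makes 3 mistakes in total, the frozen model makes 101 (every prediction after the drift is wrong), and the windowed error rate first reaches the threshold at index 104, a few steps after the drift at index 100. Real streams are noisy, so a fixed threshold like this one would need tuning or a proper statistical test.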


Related resources

Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

A data stream is a sequence of data generated from various information sources at high speed and in high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by using gradual forgetting. Concept drift refer...


Concept drift detection in business process logs using deep learning

Process mining provides a bridge between process modeling and analysis on the one hand and data mining on the other. Process mining aims at discovering, monitoring, and improving real processes by extracting knowledge from event logs. However, as most business processes change over time (e.g., the effects of new legislation, seasonal effects, etc.), traditional process mining techniques ...


New Ensemble Method for Classification of Data Streams

Classification of data streams has become an important area of data mining, as the number of applications facing these challenges increases. In this paper, we propose a new ensemble learning method for data stream classification in the presence of concept drift. Our method is capable of detecting changes and adapting to new concepts that appear in the stream. Data stream classification; concept d...


Concept Drift Detection Through Resampling

Detecting changes in data-streams is an important part of enhancing learning quality in dynamic environments. We devise a procedure for detecting concept drifts in data-streams that relies on analyzing the empirical loss of learning algorithms. Our method is based on obtaining statistics from the loss distribution by reusing the data multiple times via resampling. We present theoretical guarant...


A Simple Unlearning Framework for Online Learning Under Concept Drifts

Real-world online learning applications often face data coming from changing target functions or distributions. Such changes, called concept drift, degrade the performance of traditional online learning algorithms. Thus, many existing works focus on detecting concept drift based on statistical evidence. Other works use a sliding window or similar mechanisms to select the data that closely ref...


Modeling Concept Drift: A Probabilistic Graphical Model Based Approach

An often-used approach for detecting and adapting to concept drift in classification is to treat the data as i.i.d. and use changes in classification accuracy as an indication of concept drift. In this paper, we take a different perspective and propose a framework, based on probabilistic graphical models, that explicitly represents concept drift using latent variables. To ensure efficie...




Publication date: 2008
